AITopics | probability mass

Collaborating Authors

probability mass

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Uniform-Correct Policy Optimization: Breaking RLVR's Indifference to Diversity

Lochab, Anamika, Li, Bolian, Zhang, Ruqi

arXiv.org Machine LearningMay-4-2026

Reinforcement Learning with Verifiable Rewards (RLVR) has achieved substantial gains in single-attempt accuracy (Pass@1) on reasoning tasks, yet often suffers from reduced multi-sample coverage (Pass@K), indicating diversity collapse. We identify a structural cause for this degradation: common RLVR objectives, such as GRPO, are indifferent to how probability mass is distributed among correct solutions. Combined with stochastic training dynamics, this indifference induces a self-reinforcing collapse, in which probability mass concentrates on a narrow subset of correct outputs while alternative valid solutions are suppressed. We formalize this collapse mechanism and further characterize the optimal policy structure under two complementary criteria: robustness and entropy-regularized optimality, which identify the Uniform-Correct Policy as uniquely optimal. Motivated by this analysis, we propose Uniform-Correct Policy Optimization (UCPO), a modification to GRPO that adds a conditional uniformity penalty on the policy's distribution over correct solutions. The penalty redistributes gradient signal toward underrepresented correct responses, encouraging uniform allocation of probability mass within the correct set. Across three models (1.5B-7B parameters) and five mathematical reasoning benchmarks, UCPO improves Pass@K and diversity while maintaining competitive Pass@1, achieving up to +10\% absolute improvement on AIME24 at Pass@64 and up to 45\% higher equation-level diversity within the correct set. The code is available at https://github.com/AnamikaLochab/UCPO.

correct solution, large language model, machine learning, (18 more...)

arXiv.org Machine Learning

2605.00365

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

Add feedback

Center Smoothing: Certified Robustness for Networks with Structured Outputs Appendix

Neural Information Processing SystemsApr-25-2026, 06:52:48 GMT

Let, y be a point in that intersection. Since, by definition, ˆr(x0,) is the radius of the smallest ball with 1/2 + probability mass of f(x0 + P) over all possible centers in Rk and ˆRis the radius of the smallest such ball centered at ˆf(x), we must have ˆr(x0,) ˆR. Consider the smallest ball B(z0,ˆr(x, 1)) that encloses at least 1/2 + 1 probability mass of f(x+ P). Since, r is the radius of the minimum enclosing ball that contains at least half of the points in Z, we have r ˆr(x, 1). Now, using the definition of ˆRand following the same reasoning as theorem 2, we can say that, d( ˆf(x), ˆf(x0)) βˆr(x0,) + ˆR (1 + β) ˆR.

artificial intelligence, certificate, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.29)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Center Smoothing: Certified Robustness for Networks with Structured Outputs

Neural Information Processing SystemsApr-25-2026, 06:52:45 GMT

The study of provable adversarial robustness has mostly been limited to classification tasks and models with one-dimensional real-valued outputs. We extend the scope of certifiable robustness to problems with more general and structured outputs like sets, images, language, etc.

artificial intelligence, international conference, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.93)
North America > Canada (0.69)
Europe (0.68)

Genre: Research Report (0.47)

Industry:

Health & Medicine > Diagnostic Medicine (0.46)
Information Technology > Security & Privacy (0.30)
Government > Military (0.30)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Supplementary Material Proofs from Section 2

Neural Information Processing SystemsApr-24-2026, 18:30:00 GMT

The proof of Claim 2.3 is obtained via the following calculation, using the definition of Hermite tensor (Definition 2.2). We will use i,j for indexes in [d]. The above is equivalent to Hk(Bx) = B kHk(x). We construct the truncated distribution A as follows. We first sample x A, then we reject x unless x 2 B. Let A be the distribution of the samples we get from this process. Using Markov's inequality and union bound, we have Then it only remains to verify Ex A [Hk(x)] Ex Nm[Hk(x)] 2 for any k < d.

ak 2, artificial intelligence, lemma 3, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.46)

Add feedback

Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits

Vasilis Syrgkanis, Haipeng Luo, Akshay Krishnamurthy, Robert E. Schapire

Neural Information Processing SystemsApr-22-2026, 10:34:20 GMT

We propose a new oracle-based algorithm, BISTRO+, for the adversarial contextual bandit problem, where either contexts are drawn i.i.d. or the sequence of contexts is known a priori, but where the losses are picked adversarially.

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.49)

Add feedback

Computational and Statistical Hardness of Calibration Distance

Qiao, Mingda

arXiv.org Machine LearningMar-20-2026

The distance from calibration, introduced by Błasiok, Gopalan, Hu, and Nakkiran (STOC 2023), has recently emerged as a central measure of miscalibration for probabilistic predictors. We study the fundamental problems of computing and estimating this quantity, given either an exact description of the data distribution or only sample access to it. We give an efficient algorithm that exactly computes the calibration distance when the distribution has a uniform marginal and noiseless labels, which improves the $O(1/\sqrt{|\mathcal{X}|})$ additive approximation of Qiao and Zheng (COLT 2024) for this special case. Perhaps surprisingly, the problem becomes $\mathsf{NP}$-hard when either of the two assumptions is removed. We extend our algorithm to a polynomial-time approximation scheme for the general case. For the estimation problem, we show that $Θ(1/ε^3)$ samples are sufficient and necessary for the empirical calibration distance to be upper bounded by the true distance plus $ε$. In contrast, a polynomial dependence on the domain size -- incurred by the learning-based baseline -- is unavoidable for two-sided estimation. Our positive results are based on simple sparsifications of both the distribution and the target predictor, which significantly reduce the search space for computation and lead to stronger concentration for the estimation problem. To prove the hardness results, we introduce new techniques for certifying lower bounds on the calibration distance -- a problem that is hard in general due to its $\textsf{co-NP}$-completeness.

artificial intelligence, caldistd, machine learning, (18 more...)

arXiv.org Machine Learning

2603.18391

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.34)

Add feedback

Robust Hypothesis Testing Using Wasserstein Uncertainty Sets

RUI GAO, Liyan Xie, Yao Xie, Huan Xu

Neural Information Processing SystemsMar-16-2026, 01:06:44 GMT

We develop a novel computationally efficient and general framework for robust hypothesis testing. The new framework features a new way to construct uncertainty sets under the null and the alternative distributions, which are sets centered around the empirical distribution defined via Wasserstein metric, thus our approach is data-driven and free of distributional assumptions. We develop a convex safe approximation of the minimax formulation and show that such approximation renders a nearly-optimal detector among the family of all possible tests. By exploiting the structure of the least favorable distribution, we also develop a tractable reformulation of such approximation, with complexity independent of the dimension of observation space and can be nearly sample-size-independent in general. Real-data example using human activity data demonstrated the excellent performance of the new robust detector.

artificial intelligence, detector, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Technology: